Extracting Prior Knowledge from Data Distribution to Migrate from Blind to Semi-Supervised Clustering
Authors
Abstract:
Although many studies have been conducted to improve clustering efficiency, most state-of-the-art schemes lack robustness and stability. This paper proposes an efficient approach that elicits prior knowledge, in the form of must-link and cannot-link constraints, from the estimated distribution of the raw data in order to convert a blind clustering problem into a semi-supervised one. To estimate the density distribution of the data, a Weibull Mixture Model (WMM) is used because of its high flexibility. Another contribution of this study is a new hill and valley seeking algorithm that finds the constraints for the semi-supervised algorithm. It is assumed that each density peak marks a cluster center; therefore, the neighboring samples of each center are treated as must-link samples, while near-centroid samples belonging to different clusters are treated as cannot-link ones. The proposed approach is applied to a standard image dataset (designed for clustering evaluation) along with some UCI datasets. The results on both demonstrate the superiority of the proposed method over conventional clustering methods.
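The constraint-extraction idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it swaps the Weibull Mixture Model for a plain (unnormalized) Gaussian kernel density estimate and replaces the proposed hill and valley seeking algorithm with a simple local-maximum test. The function names and the parameters `bandwidth`, `radius`, and `k` are hypothetical, chosen only for illustration.

```python
import math


def kde_density(points, bandwidth):
    """Gaussian kernel density (up to a constant factor) at every sample.

    Assumption: a stand-in for the Weibull Mixture Model named in the abstract.
    """
    scale = 2.0 * bandwidth ** 2
    return [
        sum(math.exp(-math.dist(p, q) ** 2 / scale) for q in points) / len(points)
        for p in points
    ]


def find_peaks(points, density, radius):
    """Indices of samples that have no strictly denser sample within `radius`.

    Assumption: a simple local-maximum test, not the paper's hill/valley search.
    """
    peaks = []
    for i, p in enumerate(points):
        if all(density[j] <= density[i]
               for j, q in enumerate(points)
               if math.dist(p, q) <= radius):
            peaks.append(i)
    return peaks


def extract_constraints(points, peaks, k):
    """Build must-link and cannot-link index pairs from the density peaks."""
    neighborhoods = []
    for c in peaks:
        # The k samples closest to peak c form its neighborhood.
        others = sorted((i for i in range(len(points)) if i != c),
                        key=lambda i: math.dist(points[i], points[c]))
        neighborhoods.append([c] + others[:k])
    # Must-link: each peak with each of its near neighbors.
    must_link = [(hood[0], i) for hood in neighborhoods for i in hood[1:]]
    # Cannot-link: near-centroid samples taken from two different peaks.
    cannot_link = [(a, b)
                   for x, hood_a in enumerate(neighborhoods)
                   for hood_b in neighborhoods[x + 1:]
                   for a in hood_a
                   for b in hood_b]
    return must_link, cannot_link


# Toy data: two well-separated, symmetric blobs centered at (0, 0) and (10, 10).
offsets = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (2, 0), (-2, 0),
           (0, 2), (0, -2), (1, 1), (1, -1), (-1, 1), (-1, -1)]
points = [(cx + dx, cy + dy) for cx, cy in [(0, 0), (10, 10)] for dx, dy in offsets]

density = kde_density(points, bandwidth=1.0)
peaks = find_peaks(points, density, radius=2.5)
must_link, cannot_link = extract_constraints(points, peaks, k=4)
print("peaks:", peaks)
print("must-link pairs:", must_link)
print("number of cannot-link pairs:", len(cannot_link))
```

The resulting pairs could then be passed to any constrained clustering routine. The sketch assumes the peak neighborhoods do not overlap; with closely spaced peaks a sample could land in two neighborhoods, and the cannot-link set would need filtering.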
Similar resources
from linguistics to literature: a linguistic approach to the study of linguistic deviations in the turkish divan of shahriar
Chapter I provides an overview of structural linguistics and touches upon the Saussurean dichotomies with the final goal of exploring their relevance to the stylistic studies of literature. To provide evidence for the significance of the study, Chapter II deals with the controversial issue of linguistics and literature, and presents opposing views which, at the same time, have been central to t...
Semi-Supervised Clustering with Limited Background Knowledge
In many machine learning domains, there is a large supply of unlabeled data but limited labeled data, which can be expensive to generate. Consequently, semi-supervised learning, learning from a combination of both labeled and unlabeled data, has become a topic of significant recent interest. Our research focus is on semi-supervised clustering, which uses a small amount of supervised data in the...
Extracting Knowledge from Incomplete Data
Decision-makers are often met with situations where optimal decisions have to be made in the presence of missing information. To facilitate such work we propose application of Armstrong axioms. Key–Words: Decision support systems, uncertainty management, inference axioms
Semi-supervised Pattern Learning for Extracting Relations from Bioscience Texts
A variety of pattern-based methods have been exploited to extract biological relations from literatures. Many of them require significant domain-specific knowledge to build the patterns by hand, or a large amount of labeled data to learn the patterns automatically. In this paper, a semisupervised model is presented to combine both unlabeled and labeled data for the pattern learning procedure. F...
A Variational Approach to Semi-Supervised Clustering
We present a variational inference scheme for semi-supervised clustering in which data is supplemented with side information in the form of common labels. There is no mutual exclusion of classes assumption and samples are represented as a combinatorial mixture over multiple clusters. The method has other advantages such as the ability to find the most probable number of soft clusters in the dat...
Semi-supervised incremental clustering of categorical data
Abstract. Semi-supervised clustering combines supervised and unsupervised learning to produce better clusterings. In the initial supervised phase of the algorithm, a training sample is produced by random selection. It is assumed that the examples in the training sample are labeled by a class attribute. Then, an incremental algorithm developed for ...
Journal title
Volume 6, Issue 2
Pages 287-295
Publication date 2018-07-01